Poster: J.B. Nicholson Date: May 11, 2009 7:52am
Forum: etree Subject: Now I can see that download continuation doesn't always work.

Just to be clear, I don't work for archive.org. I'm just a satisfied user.

With regard to continued downloads of http://www.archive.org/download/Beverly_Hillbillies_Ep01_The_Clampetts_Strike_Oil/BH01_The_Clampetts_Strike_Oil_512kb.mp4, I can duplicate your result.

I don't know enough about their lighttpd configuration to be sure why things are the way they are. Apparently continuation is sometimes not available. Perhaps it is disabled on some servers, or for certain MIME types, or for certain files.

As I understand it from glancing at lighttpd docs, one can set

server.range-requests = "disable"

based on almost anything. So one could say:

$HTTP["url"] =~ "\.mp4$" {
    server.range-requests = "disable"
}

and disable range requests for all filenames ending in ".mp4".
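
To see from the outside whether a given URL honors range requests, one can ask for a single byte and look at the status code. The following is only a sketch in Python: the names `classify` and `probe` are made up for illustration, and it assumes a server that ignores the header simply answers 200 with the whole file.

```python
import urllib.error
import urllib.request


def classify(status, headers):
    """Interpret the reply to a request that carried a Range header."""
    names = {key.lower() for key, _ in headers.items()}
    if status == 206 and "content-range" in names:
        return "honored"        # server sent only the requested bytes
    if status == 416:
        return "unsatisfiable"  # range understood, but outside the file
    if status == 200:
        return "ignored"        # server is sending the whole file again
    return "unknown"


def probe(url, timeout=30):
    """Request only the first byte of url and report how the server reacted."""
    request = urllib.request.Request(url, headers={"Range": "bytes=0-0"})
    try:
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify(response.status, response.headers)
    except urllib.error.HTTPError as error:
        # urllib raises for 4xx/5xx replies, including 416.
        return classify(error.code, error.headers)
```

Calling `probe()` on the .mp4 URL above and on some other file from the same site would show whether the behavior depends on the file type or on the server that answers.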

I don't know why one would want to turn off continued downloads at all; I'd imagine file download continuation should be on all the time for any kind of file no matter which server the file comes from. I'd imagine this really hampers streaming files as you'd want to be able to pick up where you left off filling a streaming player buffer.

I hope someone from archive.org responds to you and can illuminate the situation. There might be some variables or policy issues of which I am unaware.


Poster: Nemo_bis Date: Nov 3, 2013 11:59am
Forum: etree Subject: Wayback machine doesn't support the "Range" header AKA wget --continue doesn't work

It's still not fully working, in particular on the Wayback Machine: web.archive.org sometimes holds big files, but stops sending them just a few bytes short of 100 MiB: "Connection closed at byte 104857347", says wget. You can retry at will, and in a few seconds or minutes you get 100 MiB more... but it is the same chunk again.
The wget docs say «Note that -c only works with FTP servers and with HTTP servers that support the "Range" header», and indeed it seems web.archive.org doesn't support it. In my test the response to the second request is an HTTP/1.1 200 OK, not an HTTP/1.0 206 Partial Content. Nothing changes if I interrupt the download and resume it manually instead of letting wget retry, and the partial file does exist in the directory.
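
For reference, this is roughly the decision a resuming client has to make. It is a sketch in Python, not wget's actual code; `range_header` and `resume_action` are hypothetical names. The client asks for everything from the current file size onwards, and appends only if the reply is a 206 whose Content-Range starts at that offset.

```python
import re

# Matches e.g. "bytes 104857347-288092159/288092160" (total may be "*").
_CONTENT_RANGE = re.compile(r"^bytes (\d+)-(\d+)/(\d+|\*)$")


def range_header(bytes_on_disk):
    """Header to send when bytes_on_disk bytes are already saved."""
    if bytes_on_disk > 0:
        return {"Range": f"bytes={bytes_on_disk}-"}
    return {}


def resume_action(bytes_on_disk, status, content_range=None):
    """Return 'append' if the reply continues the partial file, else 'restart'."""
    if status != 206 or content_range is None:
        return "restart"  # a plain 200 carries the file from byte 0
    match = _CONTENT_RANGE.match(content_range.strip())
    if match is None:
        return "restart"
    first_byte = int(match.group(1))
    return "append" if first_byte == bytes_on_disk else "restart"
```

With the 200 OK replies shown in the output below, `resume_action(104857347, 200)` gives "restart", which matches what I see: each retry starts over from byte 0.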

I also tried accepting gzip per some comment on the web, see full output.

$ wget --continue --header "Accept-Encoding: gzip" --tries=0 -S http://web.archive.org/web/20070720040924/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi
--2013-11-03 20:02:21-- http://web.archive.org/web/20070720040924/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi
Resolving web.archive.org (web.archive.org)... 207.241.224.26
Connecting to web.archive.org (web.archive.org)|207.241.224.26|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 302 Moved Temporarily
Server: Tengine/1.5.1
Date: Sun, 03 Nov 2013 20:02:21 GMT
Content-Type: video/x-msvideo
Transfer-Encoding: chunked
Connection: keep-alive
set-cookie: wayback_server=74; Domain=archive.org; Path=/; Expires=Tue, 03-Dec-13 20:02:21 GMT;
Link: ; rel="original"
Location: /web/20070810113028/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi
X-Archive-Wayback-Perf: [IndexLoad: 9, IndexQueryTotal: 9, RobotsFetchTotal: 2, RobotsRedis: 2, RobotsTotal: 2, Total: 14]
Set-Cookie: wb_total_perf=14; Expires=Sun, 03-Nov-2013 20:03:21 GMT; Path=/web/20070720040924/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi
X-Archive-Playback: 0
X-Page-Cache: MISS
Location: /web/20070810113028/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi [following]
--2013-11-03 20:02:21-- http://web.archive.org/web/20070810113028/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi
Reusing existing connection to web.archive.org:80.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Server: Tengine/1.5.1
Date: Sun, 03 Nov 2013 20:02:21 GMT
Content-Type: video/x-msvideo
Content-Length: 288092160
Connection: keep-alive
Memento-Datetime: Fri, 10 Aug 2007 11:30:28 GMT
Link: ; rel="original", ; rel="timemap"; type="application/link-format", ; rel="timegate", ; rel="first last memento"; datetime="Fri, 10 Aug 2007 11:30:28 GMT"
X-Archive-Orig-Connection: close
X-Archive-Orig-Content-Length: 288092160
X-Archive-Orig-Content-Type: video/x-msvideo
X-Archive-Orig-ETag: "5d5c-112bf000-4015cefd252c0"
X-Archive-Orig-Server: Apache
X-Archive-Orig-Accept-Ranges: bytes
X-Archive-Orig-Last-Modified: Thu, 22 Sep 2005 14:16:19 GMT
X-Archive-Orig-Date: Fri, 10 Aug 2007 11:30:28 GMT
X-Archive-Wayback-Perf: [IndexLoad: 6, IndexQueryTotal: 6, RobotsFetchTotal: 2, RobotsRedis: 2, RobotsTotal: 2, Total: 31, WArcResource: 24]
Set-Cookie: wb_total_perf=31; Expires=Sun, 03-Nov-2013 20:03:21 GMT; Path=/web/20070810113028/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi
X-Archive-Playback: 1
X-Page-Cache: MISS
Length: 288092160 (275M) [video/x-msvideo]
Saving to: `Wikimania05-AP1.avi'

36% [====================================> ] 104,857,347 --.-K/s in 1m 45s

2013-11-03 20:04:07 (976 KB/s) - Connection closed at byte 104857347. Retrying.

--2013-11-03 20:04:08-- (try: 2) http://web.archive.org/web/20070810113028/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi
Connecting to web.archive.org (web.archive.org)|207.241.224.26|:80... connected.
HTTP request sent, awaiting response...
HTTP/1.1 200 OK
Server: Tengine/1.5.1
Date: Sun, 03 Nov 2013 20:04:08 GMT
Content-Type: video/x-msvideo
Content-Length: 288092160
Connection: keep-alive
Memento-Datetime: Fri, 10 Aug 2007 11:30:28 GMT
Link: ; rel="original", ; rel="timemap"; type="application/link-format", ; rel="timegate", ; rel="first last memento"; datetime="Fri, 10 Aug 2007 11:30:28 GMT"
X-Archive-Orig-Connection: close
X-Archive-Orig-Content-Length: 288092160
X-Archive-Orig-Content-Type: video/x-msvideo
X-Archive-Orig-ETag: "5d5c-112bf000-4015cefd252c0"
X-Archive-Orig-Server: Apache
X-Archive-Orig-Accept-Ranges: bytes
X-Archive-Orig-Last-Modified: Thu, 22 Sep 2005 14:16:19 GMT
X-Archive-Orig-Date: Fri, 10 Aug 2007 11:30:28 GMT
X-Archive-Wayback-Perf: [IndexLoad: 10, IndexQueryTotal: 10, RobotsFetchTotal: 5, RobotsRedis: 5, RobotsTotal: 5, Total: 87, WArcResource: 74]
Set-Cookie: wb_total_perf=87; Expires=Sun, 03-Nov-2013 20:05:08 GMT; Path=/web/20070810113028/http://www.knams.wikimedia.org/wikimania/highquality/Wikimania05-AP1.avi
X-Archive-Playback: 1
X-Page-Cache: MISS
Length: 288092160 (275M) [video/x-msvideo]
Saving to: `Wikimania05-AP1.avi'

36% [====================================> ] 104,857,347 --.-K/s in 85s

2013-11-03 20:05:33 (1.18 MB/s) - Connection closed at byte 104857347. Retrying.
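
The numbers in the output above can be checked with a few lines of Python. The byte counts are copied from the wget output; the last figure assumes, hypothetically, that resuming worked and that every connection were cut at the same size.

```python
import math

MIB = 1024 * 1024
total_bytes = 288092160   # Content-Length reported by the server
cutoff_bytes = 104857347  # "Connection closed at byte ..."

# How far short of exactly 100 MiB the connection is closed.
shortfall = 100 * MIB - cutoff_bytes

# Share of the file received per attempt (wget displays 36%).
fraction = cutoff_bytes / total_bytes

# Connections needed for the whole file if each one continued
# where the previous one stopped (an assumption, see above).
requests_needed = math.ceil(total_bytes / cutoff_bytes)

print(shortfall, f"{fraction:.1%}", requests_needed)
```

So the cut happens 253 bytes before the 100 MiB mark, and without a working Range header no number of retries gets past it.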

$ wget --version
GNU Wget 1.13.4 built on linux-gnu.

+digest +https +ipv6 +iri +large-file +nls +ntlm +opie +ssl/openssl

Wgetrc:
/etc/wgetrc (system)
Locale: /usr/share/locale
Compile: gcc -DHAVE_CONFIG_H -DSYSTEM_WGETRC="/etc/wgetrc"
-DLOCALEDIR="/usr/share/locale" -I. -I../../src -I../lib
-I../../lib -D_FORTIFY_SOURCE=2 -Iyes/include -g -O2
-fstack-protector --param=ssp-buffer-size=4 -Wformat
-Wformat-security -Werror=format-security -DNO_SSLv2
-D_FILE_OFFSET_BITS=64 -g -Wall
Link: gcc -g -O2 -fstack-protector --param=ssp-buffer-size=4 -Wformat
-Wformat-security -Werror=format-security -DNO_SSLv2
-D_FILE_OFFSET_BITS=64 -g -Wall -Wl,-Bsymbolic-functions
-Wl,-z,relro -Lyes/lib -lssl -lcrypto -lz -ldl -lz -lidn -lrt
ftp-opie.o openssl.o http-ntlm.o ../lib/libgnu.a